Partial Cox regression analysis for high-dimensional microarray gene expression data
Abstract
MOTIVATION: An important application of microarray technology is to predict various clinical phenotypes based on the gene expression profile. Success has been demonstrated in molecular classification of cancer, in which different types of cancer serve as a categorical outcome variable. However, there has been less research in linking gene expression profiles to censored survival outcomes such as patients' overall survival time or time to cancer relapse. In this paper, we develop a partial Cox regression method for constructing mutually uncorrelated components based on microarray gene expression data for predicting the survival of future patients.

RESULTS: The proposed partial Cox regression method involves constructing predictive components by repeated least squares fitting of residuals and Cox regression fitting. The key difference from standard principal components Cox regression analysis is that, in constructing the predictive components, our method utilizes the observed survival/censoring information. We also propose to apply time-dependent receiver operating characteristic curve analysis to evaluate the results. We applied our methods to a publicly available dataset of diffuse large B-cell lymphoma. The results indicated that combining the partial Cox regression method with principal components analysis yields a parsimonious model with fewer components and better predictive performance. We conclude that the proposed partial Cox regression method can be very useful in building a parsimonious predictive model that can accurately predict the survival of future patients based on the gene expression profile and survival times of previous patients.

AVAILABILITY: R code is available upon request.
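The idea of alternating Cox regression fits with least squares fitting of residuals can be illustrated with a small sketch. This is not the authors' R code and is not claimed to reproduce their algorithm. It is a simplified, pure-Python illustration under several assumptions: each gene is weighted only by its univariate Cox coefficient, later components are fitted on residuals without adjusting for earlier components, the Cox coefficient is found by bisection on the partial-likelihood score, and the toy data and the names `cox_coef` and `partial_cox_components` are hypothetical.

```python
import math

def cox_coef(x, time, event, bound=10.0, iters=60):
    """Univariate Cox coefficient: root of the partial-likelihood score.

    Found by bisection on [-bound, bound] (an assumption made for
    robustness; the score is non-increasing in beta).
    """
    n = len(x)

    def score(beta):
        total = 0.0
        for i in range(n):
            if event[i]:
                # Risk set: patients still under observation at time[i].
                risk = [j for j in range(n) if time[j] >= time[i]]
                w = [math.exp(beta * x[j]) for j in risk]
                total += x[i] - sum(wj * x[j] for wj, j in zip(w, risk)) / sum(w)
        return total

    lo, hi = -bound, bound
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def partial_cox_components(genes, time, event, k):
    """Build up to k components; genes is a list of per-gene value lists."""
    n = len(time)
    # Centre each gene so that orthogonal components are also uncorrelated.
    resid = [[v - sum(g) / n for v in g] for g in genes]
    comps = []
    for _ in range(k):
        # Cox step: one coefficient per (residualised) gene.
        betas = [cox_coef(r, time, event) for r in resid]
        t = [sum(b * r[i] for b, r in zip(betas, resid)) for i in range(n)]
        tt = dot(t, t)
        if tt == 0.0:
            break
        comps.append(t)
        # Least squares step: remove the new component from every gene.
        resid = [[r[i] - dot(r, t) / tt * t[i] for i in range(n)] for r in resid]
    return comps

# Hypothetical toy data: 8 patients, 3 genes, event = 1 means death observed.
time = [2, 4, 5, 7, 9, 11, 13, 16]
event = [1, 1, 0, 1, 1, 0, 1, 1]
genes = [
    [2.1, 1.5, 1.8, 0.9, 0.4, 0.7, -0.3, -1.0],   # high value, early death
    [0.3, -1.2, 0.8, 0.1, -0.5, 1.1, -0.9, 0.4],  # unrelated to survival
    [-1.4, -0.2, -0.9, 0.3, -0.1, 0.8, 1.2, 1.9], # high value, late death
]
comps = partial_cox_components(genes, time, event, 2)
```

Because every gene is residualised against each new component before the next Cox step, the resulting components are orthogonal, which for centred data means mutually uncorrelated. Unlike principal components, the weights depend on the survival and censoring information through the Cox fits.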
Related articles
Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine
DNA microarray gene expression data give access to a wealth of information on thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce high-dimensional data by statistical methods to select valuable, high-impact genes as biomarkers and then classify ovarian tumors based on gene expression data of...
Analysis of additive risk model with high-dimensional covariates using partial least squares.
In this paper, we construct a partial additive regression (PAR) model to predict the survival times of cancer patients based on microarray gene expression data with right censoring. The area under time-dependent receiver operating characteristic curve is used as a model evaluation criterion. We conduct a simulation study to compare the proposed method with other methods, i.e. partial Cox regres...
Assessing Patient Survival Using Microarray Gene Expression Data Via Partial Least Squares Proportional Hazard Regression
High-dimensional data sets from microarray experiments, where the number of variables (genes) p far exceeds the number of samples N, render most traditional statistical tools of little direct use. However, some of these statistical tools, when used in conjunction with an appropriate dimension reduction method, can be effective. In this paper we introduce the use of the proportional hazard (PH) regressi...
Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data
MOTIVATION An important application of microarray technology is to relate gene expression profiles to various clinical phenotypes of patients. Success has been demonstrated in molecular classification of cancer in which the gene expression data serve as predictors and different types of cancer serve as a categorical outcome variable. However, there has been less research in linking gene express...
Principal Component Analysis in Linear Regression Survival Model with Microarray Data
As a useful alternative to the Cox proportional hazards model, the linear regression survival model assumes a linear relationship between the covariates and a known monotone transformation, for example logarithm, of an event time of interest. In this article, we study the linear regression survival model with right censored survival data, when high-dimensional microarray measurements are presen...
Journal: Bioinformatics
Volume: 20, Suppl 1
Publication date: 2004